Dataset Characteristics (Metafeatures)
Abstract
This chapter discusses dataset characteristics that play a crucial role in many metalearning systems. Typically, they help to restrict the search in a given configuration space. The basic characteristic of the target variable, for instance, determines the choice of the right approach. If it is numeric, it suggests that a suitable regression algorithm should be used, while if it is categorical, a classification algorithm should be used instead. This chapter provides an overview of different types of dataset characteristics, which are sometimes also referred to as metafeatures. These are of different types, and include so-called simple, statistical, information-theoretic, model-based, complexity-based, and performance-based metafeatures. The last group of characteristics has the advantage that it can be easily defined in any domain. These characteristics include, for instance, sampling landmarkers representing the performance of particular algorithms on samples of data, and relative landmarkers capturing differences or ratios of performance values and providing estimates of performance gains. The final part of this chapter discusses the specific dataset characteristics used in different machine learning tasks, including classification, regression, time series, and clustering.
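As a rough illustration of the metafeature types named in the abstract, the sketch below computes a few simple, statistical, and information-theoretic characteristics for a small classification dataset, plus a relative landmarker from two performance values. It is not taken from the chapter: the function names, the choice of the mean to aggregate per-attribute skewness, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter
from statistics import fmean


def class_entropy(labels):
    """Information-theoretic: Shannon entropy (bits) of the class distribution."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def skewness(values):
    """Statistical: population skewness of one numeric attribute (0.0 if constant)."""
    mu = fmean(values)
    m2 = fmean((v - mu) ** 2 for v in values)
    m3 = fmean((v - mu) ** 3 for v in values)
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5


def metafeatures(rows, labels):
    """Compute a handful of metafeatures for numeric rows and class labels."""
    columns = list(zip(*rows))  # transpose rows into attribute columns
    return {
        # Simple metafeatures
        "n_instances": len(rows),
        "n_attributes": len(columns),
        "n_classes": len(set(labels)),
        # Statistical metafeature, aggregated over attributes (mean is an assumption)
        "mean_skewness": fmean(skewness(col) for col in columns),
        # Information-theoretic metafeature
        "class_entropy": class_entropy(labels),
    }


def relative_landmarker(perf_a, perf_b):
    """Hypothetical helper: difference and ratio of two performance values."""
    return {"difference": perf_a - perf_b, "ratio": perf_a / perf_b}


if __name__ == "__main__":
    rows = [(1.0, 2.0), (2.0, 2.0), (3.0, 2.0), (4.0, 2.0)]
    labels = ["a", "a", "b", "b"]
    print(metafeatures(rows, labels))
    print(relative_landmarker(0.9, 0.6))
```

In a metalearning setting, the two performance values would come from evaluating two algorithms on the same dataset or data sample. Here they are passed in directly to keep the sketch self-contained.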
Similar resources
Towards Automatic Generation of Metafeatures
The selection of metafeatures for metalearning (MtL) is often an ad hoc process. The lack of a proper motivation for the choice of a metafeature rather than others is questionable and may originate a loss of valuable information for a given problem (e.g., use of class entropy and not attribute entropy). We present a framework to systematically generate metafeatures in the context of MtL. This f...
A Framework To Decompose And Develop Metafeatures
This paper proposes a framework to decompose and develop metafeatures for Metalearning (MtL) problems. Several metafeatures (also known as data characteristics) are proposed in the literature for a wide range of problems. Since MtL applicability is very general but problem dependent, researchers focus on generating specific and yet informative metafeatures for each problem. This process is carr...
Supplementary material for: Initializing Bayesian Hyperparameter Optimization via Meta-Learning
To evaluate our approach in a realistic setting we implemented 46 metafeatures from the literature listed in Table 1. These metafeatures are computed only for the training set. While most of them can be computed for a whole dataset, some of them (e.g., skewness) are defined for each attribute of a dataset. In this case, we compute the metafeature for each attribute of the dataset and use the m...
Using Metafeatures to Increase the Effectiveness of Latent Semantic Models in Web Search
In web search, latent semantic models have been proposed to bridge the lexical gap between queries and documents that is due to the fact that searchers and content creators often use different vocabularies and language styles to express the same concept. Modern search engines simply use the outputs of latent semantic models as features for a so-called global ranker. We argue that this is not op...
Automatic Detection of Online Recruitment Frauds: Characteristics, Methods, and a Public Dataset
The critical process of hiring has relatively recently been ported to the cloud. Specifically, the automated systems responsible for completing the recruitment of new employees in an online fashion, aim to make the hiring process more immediate, accurate and cost-efficient. However, the online exposure of such traditional business procedures has introduced new points of failure that may lead to...
Journal
Journal title: Cognitive technologies
Year: 2022
ISSN: 2197-6635, 1611-2482
DOI: https://doi.org/10.1007/978-3-030-67024-5_4